Bayesian Mixtures of Bernoulli Distributions
Abstract
The mixture of Bernoulli distributions [6] is a technique that is frequently used for modeling binary random vectors. It differs from (restricted) Boltzmann Machines in that it does not model the marginal distribution over the binary data space X as a product of (conditional) Bernoulli distributions, but as a weighted sum of Bernoulli distributions. Despite the non-identifiability of the mixture of Bernoulli distributions [3], it has been successfully applied to, e.g., dichotomous perceptual decision making [1], text classification [7], and word categorization [4]. Mixtures of Bernoulli distributions are typically trained using an expectation-maximization (EM) algorithm, i.e., by performing maximum likelihood estimation. In this report, we develop a Gibbs sampler for a fully Bayesian variant of the Bernoulli mixture, in which (conjugate) priors are introduced over both the mixing proportions and the parameters of the Bernoulli distributions. We develop both a finite Bayesian Bernoulli mixture (using a Dirichlet prior over the latent class assignment variables) and an infinite Bernoulli mixture (using a Dirichlet Process prior). We perform experiments in which we compare the performance of the Bayesian Bernoulli mixtures with that of a standard Bernoulli mixture and a Restricted Boltzmann Machine on a task in which the (unobserved) bottom half of a handwritten digit needs to be predicted from the (observed) top half of that digit. The outline of this report is as follows. Section 2 describes the generative model of the Bayesian Bernoulli mixture. Section 3 describes how inference is performed in this model using a collapsed Gibbs sampler. Section 4 extends the Bayesian Bernoulli mixture to an infinite mixture model, and describes the required collapsed Gibbs sampler. Section 6 presents the setup and results of our experiments on a handwritten digit prediction task.
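The two ideas in the abstract, the weighted-sum likelihood and collapsed Gibbs sampling for the finite mixture, can be illustrated with a minimal sketch. It is not the report's implementation. It assumes a symmetric Dirichlet prior with concentration `alpha` on the mixing proportions and a Beta(`a`, `b`) prior on every Bernoulli parameter, and the function names `mixture_likelihood` and `gibbs_sweep` are hypothetical. With both the mixing proportions and the Bernoulli parameters integrated out, the sampler only needs per-component counts.

```python
import math
import random


def mixture_likelihood(x, weights, mus):
    """p(x) = sum_k weights[k] * prod_d mus[k][d]^x_d * (1 - mus[k][d])^(1 - x_d)."""
    total = 0.0
    for w, mu in zip(weights, mus):
        p = w
        for xd, m in zip(x, mu):
            p *= m if xd else (1.0 - m)
        total += p
    return total


def gibbs_sweep(X, z, K, alpha=1.0, a=1.0, b=1.0, rng=random):
    """One collapsed Gibbs sweep over the assignments z (updated in place).

    Assumed priors (not taken from the report): symmetric Dirichlet(alpha)
    on the mixing proportions and Beta(a, b) on each Bernoulli parameter.
    """
    D = len(X[0])
    # Sufficient statistics: points per component and ones per dimension.
    counts = [0] * K
    ones = [[0] * D for _ in range(K)]
    for x, k in zip(X, z):
        counts[k] += 1
        for d in range(D):
            ones[k][d] += x[d]

    for n, x in enumerate(X):
        # Remove point n from its current component.
        k_old = z[n]
        counts[k_old] -= 1
        for d in range(D):
            ones[k_old][d] -= x[d]

        # Unnormalized log p(z_n = k | z_{-n}, X) for every component.
        log_p = []
        for k in range(K):
            lp = math.log(counts[k] + alpha)
            log_denom = math.log(a + b + counts[k])
            for d in range(D):
                num = a + ones[k][d] if x[d] else b + counts[k] - ones[k][d]
                lp += math.log(num) - log_denom
            log_p.append(lp)

        # Sample a new assignment; subtract the max for numerical stability.
        m = max(log_p)
        probs = [math.exp(lp - m) for lp in log_p]
        k_new = rng.choices(range(K), weights=probs)[0]

        # Add point n back under its new component.
        z[n] = k_new
        counts[k_new] += 1
        for d in range(D):
            ones[k_new][d] += x[d]
    return z
```

In use, `z` would be initialized at random and `gibbs_sweep` called repeatedly, with the sampled assignments collected after a burn-in period. The infinite (Dirichlet Process) variant mentioned in the abstract would additionally allow a new component to be opened at each step, which this sketch does not cover.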
Similar resources
Estimating Well-Performing Bayesian Networks using Bernoulli Mixtures
A novel method for estimating Bayesian network (BN) parameters from data is presented which provides improved performance on test data. Previous research has shown the value of representing conditional probability distributions (CPDs) via neural networks (Neal 1992), noisy-OR gates (Neal 1992, Diez 1993) and decision trees (Friedman and Goldszmidt 1996). The Bernoulli mixture network (BM...
A SAS Procedure Based on Mixture Models for Estimating Developmental Trajectories
This article introduces a new SAS procedure written by the authors that analyzes longitudinal data (developmental trajectories) by fitting a mixture model. The TRAJ procedure fits semiparametric (discrete) mixtures of censored normal, Poisson, zero-inflated Poisson, and Bernoulli distributions to longitudinal data. Applications to psychometric scale data, offense counts, and a dichotomous preva...
Bayesian and Iterative Maximum Likelihood Estimation of the Coefficients in Logistic Regression Analysis with Linked Data
This paper considers logistic regression analysis with linked data. It is shown that, in logistic regression analysis with linked data, a finite mixture of Bernoulli distributions can be used for modeling the response variables. We proposed an iterative maximum likelihood estimator for the regression coefficients that takes the matching probabilities into account. Next, the Bayesian counterpart...
On optimization, parallelization and convergence of the Expectation-Maximization algorithm for finite mixtures of Bernoulli distributions
This paper reviews the Maximum Likelihood estimation problem and its solution via the Expectation-Maximization algorithm. Emphasis is made on the description of finite mixtures of multi-variate Bernoulli distributions for modeling 0-1 data. General ideas about convergence and non-identifiability are presented. We discuss improvements to the algorithm and describe thoroughly what we believe are ...
Baseline Mixture Models for Social Networks
Continuous mixtures of distributions are widely employed in the statistical literature as models for phenomena with highly divergent outcomes; in particular, many familiar heavy-tailed distributions arise naturally as mixtures of light-tailed distributions (e.g., Gaussians), and play an important role in applications as diverse as modeling of extreme values and robust inference. In the case of s...